Conference call and mobile communication devices that participate in a conference call

ABSTRACT

A first mobile communication device that includes a first microphone, a first speaker, and a first delay unit. The first microphone is configured to (i) receive, during a conference call, a first user first microphone signal from a first user, and (ii) output a first microphone digital signal to the first delay unit. The first user first microphone signal represents audio content outputted by the first user. The first delay unit is configured to delay, by a delay period, the first microphone digital signal to provide a delayed first user first device digital signal. The first mobile communication device is configured to output, to a mixer, the delayed first user first device digital signal. The delay period is determined based on measurements executed by at least one mobile communication device out of the first mobile communication device, a second mobile communication device and a third mobile communication device.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation application and claims the benefit of U.S. patent application Ser. No. 17/812,168 entitled “CONFERENCE CALL AND MOBILE COMMUNICATION DEVICES THAT PARTICIPATE IN A CONFERENCE CALL” and filed on Jul. 12, 2022, which is a continuation application and claims the benefit of U.S. patent application Ser. No. 15/454,012 entitled “CONFERENCE CALL AND MOBILE COMMUNICATION DEVICES THAT PARTICIPATE IN A CONFERENCE CALL” and filed on Mar. 9, 2017, which claims the benefit of U.S. Provisional Patent Application No. 62/306,101 entitled “CONFERENCING AND HOWLING SYSTEM” and filed on Mar. 10, 2016, all of which are assigned to the assignee hereof. The disclosures of all prior Applications are considered part of and are incorporated by reference in this Patent Application.

BACKGROUND

There is a growing need to allow people to conduct conference calls in a cost-effective manner.

SUMMARY

There may be provided methods, systems, and mobile communication devices as illustrated in the specification and/or the claims.

There may be provided a first mobile communication device that includes a first microphone, a first speaker and a first delay unit. The first microphone is configured to (i) receive, during a conference call, a first user first microphone signal from a first user, and (ii) output a first microphone digital signal to the first delay unit; wherein the first user first microphone signal represents audio content outputted by the first user. The first delay unit is configured to delay, by a delay period, the first microphone digital signal to provide a delayed first user first device digital signal. The first mobile communication device is configured to output, to a mixer, the delayed first user first device digital signal. The delay period is determined based on measurements executed by at least one mobile communication device out of the first mobile communication device, a second mobile communication device and a third mobile communication device. The mixer is in communication with the first mobile communication device, the second mobile communication device, and the third mobile communication device.

There may be provided a method for participating in a conference call, the method may include receiving a first user first microphone signal from a first user, by a first microphone of a first mobile communication device and during the conference call; outputting, by the first microphone, a first microphone digital signal to a first delay unit of the first mobile communication device; wherein the first user first microphone signal represents audio content outputted by the first user; delaying the first microphone digital signal, by the first delay unit and by a delay period, to provide a delayed first user first device digital signal; wherein the delay period is determined based on measurements executed by at least one mobile communication device out of the first mobile communication device, a second mobile communication device and a third mobile communication device; and outputting, by the first mobile communication device, the delayed first user first device digital signal to a mixer; wherein the mixer is in communication with the first mobile communication device, the second mobile communication device, and the third mobile communication device.

There may be provided a non-transitory computer readable medium that stores instructions for participating in a conference call, the instructions causing a first mobile communication device to: receive by a first microphone of the first mobile communication device, during the conference call, a first user first microphone signal from a first user; output, by the first microphone, a first microphone digital signal to a first delay unit of the first mobile communication device; wherein the first user first microphone signal represents audio content outputted by the first user; delay, by the first delay unit and by a delay period, the first microphone digital signal to provide a delayed first user first device digital signal; wherein the delay period is determined based on measurements executed by at least one mobile communication device out of the first mobile communication device, a second mobile communication device and a third mobile communication device; and output, by the first mobile communication device, the delayed first user first device digital signal to a mixer; wherein the mixer is in communication with the first mobile communication device, the second mobile communication device, and the third mobile communication device.

BRIEF DESCRIPTION OF THE DRAWINGS

It will be appreciated that for simplicity and clarity of illustration, elements shown in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements for clarity. Further, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements.

FIG. 1A illustrates three persons that participate in a conference call, a mixer and three devices according to an embodiment of the invention;

FIG. 1B illustrates three persons that participate in a conference call, a network and three devices according to an embodiment of the invention;

FIG. 1C illustrates three persons that participate in a conference call, a connecting network, a near-end network and three devices according to an embodiment of the invention;

FIG. 2A illustrates a third person that talks during a conference call, a mixer, three devices and various signals generated during the conference call according to an embodiment of the invention;

FIG. 2B illustrates various components and one or more compensation modules for performing compensation operations related to some of the signals of FIG. 2A according to an embodiment of the invention;

FIG. 3A illustrates a first person that talks during a conference call, a mixer, three devices and various signals generated during the conference call according to an embodiment of the invention;

FIG. 3B illustrates various components and one or more compensation modules for performing compensation operations related to some of the signals of FIG. 3A according to an embodiment of the invention;

FIG. 4A illustrates a third person that talks during a conference call, a mixer, three devices and various signals generated during the conference call according to an embodiment of the invention;

FIG. 4B illustrates various components and one or more compensation modules for performing compensation operations related to some of the signals of FIG. 4A according to an embodiment of the invention;

FIG. 5A illustrates a third person that talks during a conference call, a mixer, three devices and various signals generated during the conference call according to an embodiment of the invention;

FIG. 5B illustrates various components and one or more compensation modules for performing compensation operations related to some of the signals of FIG. 5A according to an embodiment of the invention;

FIG. 6A illustrates a first person that talks during a conference call, a mixer, three devices and various signals generated during the conference call according to an embodiment of the invention;

FIG. 6B illustrates various components and one or more compensation modules for performing compensation operations related to some of the signals of FIG. 6A according to an embodiment of the invention;

FIG. 7A illustrates a first person that talks during a conference call, a mixer, three devices and various signals generated during the conference call according to an embodiment of the invention;

FIG. 7B illustrates various components and one or more compensation modules for performing compensation operations related to some of the signals of FIG. 7A according to an embodiment of the invention;

FIG. 7C illustrates various components and one or more compensation modules for performing compensation operations related to some of the signals of FIG. 7A according to an embodiment of the invention;

FIG. 8A illustrates a first person that talks during a conference call, a mixer, three devices and various signals generated during the conference call according to an embodiment of the invention;

FIG. 8B illustrates various components and one or more compensation modules for performing compensation operations related to some of the signals of FIG. 8A according to an embodiment of the invention;

FIG. 8C illustrates various components and one or more compensation modules for performing compensation operations related to some of the signals of FIG. 8A according to an embodiment of the invention;

FIG. 9A illustrates various components and one or more compensation modules of a first device of a first person according to an embodiment of the invention;

FIG. 9B illustrates various components and one or more compensation modules of a first device of a first person according to an embodiment of the invention;

FIG. 9C illustrates various components and one or more compensation modules of a first device of a first person according to an embodiment of the invention;

FIG. 9D illustrates various components and one or more compensation modules of a first device of a first person according to an embodiment of the invention;

FIG. 10A illustrates a third person that talks during a conference call, a near end network, a connecting network, three devices and various signals generated during the conference call according to an embodiment of the invention;

FIG. 10B illustrates a person that talks during a conference call, a near end network, a connecting network, three devices and various signals generated during the conference call according to an embodiment of the invention;

FIG. 11A illustrates a third person and a first person that talk during a conference call, a mixer, three devices that include a second device that has a turned off microphone, and various signals generated during the conference call according to an embodiment of the invention;

FIG. 11B illustrates a third person and a first person that talk during a conference call, a mixer, three devices that include a second device that has a turned off speaker, and various signals generated during the conference call according to an embodiment of the invention;

FIG. 11C illustrates a third person and a first person that talk during a conference call, a mixer, three devices that include a second device that has a turned off microphone and a first device with a turned off speaker, and various signals generated during the conference call according to an embodiment of the invention;

FIG. 11D illustrates a third person and a first person that talk during a conference call, a mixer, three devices that include a second device that has a turned off microphone, a third device with a turned off microphone, and a first device with a turned off speaker, as well as various signals generated during the conference call according to an embodiment of the invention; and

FIG. 12A illustrates a second person that talks during a conference call, a mixer, three devices, and various signals generated during the conference call according to an embodiment of the invention.

DETAILED DESCRIPTION OF THE DRAWINGS

In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, and components have not been described in detail so as not to obscure the present invention.

The term “comprising” is synonymous with (means the same thing as) “including,” “containing” or “having” and is inclusive or open-ended and does not exclude additional, unrecited elements or method steps.

The term “consisting” is closed (only includes exactly what is stated) and excludes any additional, unrecited elements or method steps.

The term “consisting essentially of” limits the scope to specified materials or steps and those that do not materially affect the basic and novel characteristics.

In the claims and specification any reference to the term “comprising” (or “including” or “containing”) should be applied mutatis mutandis to the term “consisting” and should be applied mutatis mutandis to the phrase “consisting essentially of”.

In the claims and specification any reference to the term “consisting” should be applied mutatis mutandis to the term “comprising” and should be applied mutatis mutandis to the phrase “consisting essentially of”.

In the claims and specification any reference to the phrase “consisting essentially of” should be applied mutatis mutandis to the term “comprising” and should be applied mutatis mutandis to the term “consisting”.

Any reference in the specification to a method should be applied mutatis mutandis to a system capable of executing the method.

Any reference in the specification to a system should be applied mutatis mutandis to a method that may be executed by the system.

The terms “cancellation” and “suppression” are used in an interchangeable manner.

The term “substantially” or “about” can refer to an accuracy (or deviation) of any value between 1 and 20 percent.

The term “proximate” may refer to a range of distances that may span, for example, between a fraction of a millimeter and less than 5 centimeters.

Any combination of any components of any of the devices and/or systems illustrated in any of the figures may be provided.

Any device, network, mixer, module and/or system that is illustrated in any of the figures may include additional components, may include alternative components, may include fewer components, may be limited to the components illustrated in the figure or may be essentially limited to the components illustrated in the figure.

There is provided a method for conducting conference calls using mobile devices and an additional mixer and/or network that may convey signals between the mobile devices.

The mobile device may be a smartphone, a baby monitor, a mobile device used by public security officers, or a mobile gaming console, but any other mobile device that includes a speaker, a microphone and compensation modules (as illustrated below) can be used for conducting conference calls.

In various figures three persons participate in the conference call, wherein the first and second persons are in the same space and can be heard by each other. This is only for brevity of explanation. The number of participants in a conference call may exceed three. The participants may be arranged in other manners. For example, more than three participants may be located in the same room and more than a single participant may be located in another room.

Some of the figures illustrate a single talking participant and other figures illustrate two persons that talk. It is noted that any combination of participants may talk at the same time.

In the various figures the first and second persons 11 and 12 are located in the same room and may be referred to as near end participants. The third person is located in another room and is referred to as a far end participant.

It should be noted that the persons may be located at any location, inside or outside a room.

The following table lists various signals. The signals are named so that a name may refer to (a) the person that either generated the signal or caused the generation of the signal, (b) the device that either received the signal or generated the signal, and (c) the type of the signal: acoustic, digital, leakage or converted signal.

A converted signal may be outputted from a microphone and represents an acoustic signal received by the microphone. The converted signal may have also been processed by one or more modules such as signal processing modules including but not limited to filters and the like.

TABLE 1 lists some of the signals that are illustrated in the drawings. For simplicity of explanation the specification may refer to any of these signals by their full name (listed in the table) or refer to them as “signals” or “signal”.

TABLE 1

# | Name | Remark
1011 | First user first microphone acoustic signal | Acoustic signal generated by the first user and received by the first microphone
1012 | First user second microphone acoustic signal | Acoustic signal generated by the first user and received by the second microphone
1013 | First user second acoustic leakage signal | Acoustic signal generated by the first device microphone as a result from a digital leakage signal from the second device through the mixer and to the first device, in response to the acoustic signal generated by the first user
1014 | First user first acoustic leakage signal | Acoustic signal generated by the second device microphone as a result from a digital leakage signal from the first device through the mixer and to the second device, in response to the acoustic signal generated by the first user
1031 | Third user first speaker acoustic signal | Acoustic signal generated by the first speaker in response to the acoustic signal generated by the third user
1032 | Third user second speaker acoustic signal | Acoustic signal generated by the second speaker in response to the acoustic signal generated by the third user
1041 | Third user first device echo | Echo of the first device (from speaker to microphone) in response to the acoustic signal generated by the third user
1041′ | Converted third user first device echo | Signal 1041 after conversion to an electrical signal by first microphone
1042 | Third user second device echo | Echo of the second device (from speaker to microphone) in response to the acoustic signal generated by the third user
1043 | First user first device echo | Echo of the first device (from speaker to microphone) in response to a digital leakage signal from the second device through the mixer and to the first device, leakage resulting from response to the acoustic signal generated by the first user
1044 | First user second device echo | Echo of the second device (from speaker to microphone) in response to a digital leakage signal from the first device through the mixer and to the second device, leakage resulting from response to the acoustic signal generated by the first user
1051 | Third user first cross echo | Cross echo (from second speaker to first microphone) in response to the acoustic signal generated by the third user
1052 | Third user second cross echo | Cross echo (from first speaker to second microphone) in response to the acoustic signal generated by the third user
1052′ | Converted third user second echo | Signal 1052 after conversion to an electrical signal by first microphone
1053 | First user first device cross echo | Echo from the first device that is received by the second device and results from digital leakage from the second device via the mixer and to the first device, the echo results from the acoustic signal generated by the first user
1054 | First user second device cross echo | Echo from the second device that is received by the first device and results from digital leakage from the first device via the mixer and to the second device, the echo results from the acoustic signal generated by the first user
1111 | First user first device digital signal | Digital signal outputted by the first device in response to the acoustic signal generated by the first user
1112 | First user second device digital signal | Digital signal outputted by the second device in response to the acoustic signal generated by the first user
1131 | Third user first device input signal | Digital signal outputted by the mixer and received by the first device in response to the acoustic signal generated by the third user
1131′ | Converted third user first device input signal | Signal 1131 after conversion to an electrical signal by first microphone
1132 | Third user second device input signal | Digital signal outputted by the mixer and received by the second device in response to the acoustic signal generated by the third user
1141 | First user second digital leakage signal | Digital leakage signal from the first device through the mixer and to the second device, in response to the acoustic signal generated by the first user
1142 | First user first digital leakage signal | Digital leakage signal from the second device through the mixer and to the first device, in response to the acoustic signal generated by the first user
1310 | First user mixed digital signal | Digital signal outputted by the mixer in response to the acoustic signal generated by the first user
1330 | Third user third device digital signal | Digital signal outputted by the third device in response to the acoustic signal generated by the third user
1335 | Third user connecting network digital signal | Digital signal outputted by the connecting network in response to the acoustic signal generated by the third user
1336 | First user first device connecting network output signal | Digital signal outputted by the first device to the connecting network in response to the acoustic signal generated by the third user
1337 | First user connecting network output signal | Digital signal outputted by the connecting network in response to the acoustic signal generated by the third user
1410 | First user third device acoustic output signal | Acoustic signal outputted by the third speaker in response to the acoustic signal generated by the first user
1430 | Third user third microphone acoustic signal | Acoustic signal generated by the third user and received by the third microphone
1501 | Third user first device near end network digital signal | Digital signal outputted by the first device to the near end network in response to the acoustic signal generated by the third user
1502 | Third user near end network second device digital signal | Digital signal outputted by the near end network to the second device in response to the acoustic signal generated by the third user
1511 | First user near end network second device digital signal | Digital signal outputted by the near end network to the first device in response to the acoustic signal generated by the first user
1512 | First user second device near end network digital signal | Digital signal outputted by the second device to the near end network in response to the acoustic signal generated by the first user
1601 | Second user first user acoustic signal | Acoustic signal generated by the second user and received by the first user (for example, without using any device)
1602 | Second user second microphone acoustic signal | Acoustic signal generated by the second user and received by the second microphone
1603 | Second user second device digital signal | Digital signal outputted by the second device in response to the acoustic signal generated by the second user
1604 | Second user first microphone acoustic signal | Acoustic signal generated by the second user and received by the first microphone
1605 | Second user first digital leakage signal | Digital leakage signal from the second device through the mixer and to the first device, in response to the acoustic signal generated by the second user
1606 | Second user first device digital signal | Digital signal outputted by the first device in response to the acoustic signal generated by the second user
1607 | Second user mixed digital signal | Digital signal outputted by the mixer in response to the acoustic signal generated by the second user
1608 | Second user third device acoustic output signal | Acoustic signal outputted by the third speaker in response to the acoustic signal generated by the second user

FIG. 1A illustrates three persons 11, 12 and 13 that participate in a conference call, a mixer 25 and three devices (first device 21, second device 22 and third device 23) that may (or may not) belong to first through third persons 11, 12 and 13 respectively.

First device 21 includes first microphone 41, first speaker 31 and first input output (I/O) port 51. Second device 22 includes second microphone 42, second speaker 32 and second I/O port 52. Third device 23 includes third microphone 43, third speaker 33 and third I/O port 53.

An input output (I/O) port may be any communication port, in particular one that is not an acoustic port. The I/O port may be, for example, a wireless or wired communication port. The I/O port may be used for outputting digital signals and/or representations of digital signals. The I/O port may be used to convey radio frequency (RF) signals and/or other signals.

FIG. 1A also illustrates the acoustic link 80 between first and second persons 11 and 12, as these persons may hear each other without using devices.

There may be delays between the transmission and/or reception of signals between the mixer and each one of the first and second devices.

A difference between network delays from the mixer to the first and second devices is denoted reception delay difference (Drx).

A difference between network delays from the first and second devices to the mixer is denoted transmission delay difference (Dtx).

FIG. 1A illustrates first Rx delay 91, second Rx delay 92, first Tx delay 93, and second Tx delay 94.
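As a rough numerical illustration of the two delay differences (the reception delay difference Drx and the transmission delay difference, referred to as Dtx later in the description), the following sketch uses hypothetical one-way delays. The millisecond values, the 16 kHz sample rate, the helper name samples_for and the sign convention (first device minus second device) are assumptions made for illustration only and are not taken from the specification.

```python
# Hypothetical one-way network delays in milliseconds (illustrative values only).
first_rx_delay_ms = 40.0   # mixer -> first device (first Rx delay 91)
second_rx_delay_ms = 55.0  # mixer -> second device (second Rx delay 92)
first_tx_delay_ms = 30.0   # first device -> mixer (first Tx delay 93)
second_tx_delay_ms = 42.0  # second device -> mixer (second Tx delay 94)

# Assumed sign convention: first device minus second device.
drx_ms = first_rx_delay_ms - second_rx_delay_ms
dtx_ms = first_tx_delay_ms - second_tx_delay_ms


def samples_for(delay_ms, sample_rate_hz=16000):
    """Convert a delay difference into a whole number of audio samples."""
    return round(abs(delay_ms) * sample_rate_hz / 1000.0)


print(drx_ms, dtx_ms, samples_for(drx_ms), samples_for(dtx_ms))
```

Under this convention a negative value means that the first device has the shorter path, so the first device would be the one to add delay.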

FIG. 1A further illustrates first room 81 in which the first and second persons, the first and second devices and the mixer are located, as well as second room 82 in which the third person and the third device are located.

The mixer may be located elsewhere, and may be replaced by one or more networks.

FIG. 1B illustrates three persons 11, 12 and 13 that participate in a conference call, a network 26 and three devices according to an embodiment of the invention. Network 26 of FIG. 1B replaces mixer 25 and has mixing capabilities for combining the signals from the first and second devices.

FIG. 1C illustrates three persons 11, 12 and 13 that participate in a conference call, a connecting network 28, a near-end network 27 and three devices 21, 22 and 23 according to an embodiment of the invention.

In FIG. 1C the connecting network 28 (which may be a long-range network such as the Internet or any other long-range network) communicates with the first device and may not communicate with the second device. The near-end network 27 may relay signals (for example digital signals) between the first and second devices.

The near-end network may be a short-range network such as but not limited to Bluetooth, BLE, Wi-Fi, PROSE Relay from PSLTE, voice over IP, etc.

FIG. 2A illustrates a third person 13 that talks during a conference call, a mixer 25, three devices 21, 22 and 23 and various signals (1430, 1330, 1131, 1132, 1031 and 1032) generated during the conference call according to an embodiment of the invention.

As indicated above there may be a delay difference of Drx between signal 1131 and signal 1132. Unless compensated, signal 1031 and signal 1032 may suffer from a delay difference of Drx.

The perceived speech quality and intelligibility decrease due to Drx.

Drx should be estimated and compensated in each near-end device or at least in one of the first and second devices.

Drx may be estimated in various manners. For example, Drx may be estimated by cross correlation between signals 1031 and 1032, which are received by the microphones of both devices as echo. Additionally or alternatively, the correlation between signals 1031 and 1131 in the first device can be used to estimate the delay.
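The cross-correlation approach mentioned above can be sketched as follows. This is a minimal illustration rather than the estimator of any embodiment: the function name estimate_delay, the search range max_lag, and the synthetic noise signals standing in for the two converted echo signals are all assumptions.

```python
import random


def estimate_delay(reference, delayed, max_lag):
    """Return the lag (in samples) at which `delayed` best matches `reference`.

    A positive result means that `delayed` lags behind `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(reference):
            m = n + lag
            if 0 <= m < len(delayed):
                score += x * delayed[m]  # one term of the cross-correlation
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical stand-ins for the two signals being compared.
rng = random.Random(0)
source = [rng.uniform(-1.0, 1.0) for _ in range(400)]
true_delay = 7
signal_a = source
signal_b = [0.0] * true_delay + source[:-true_delay]

estimated_drx = estimate_delay(signal_a, signal_b, max_lag=20)
print(estimated_drx)
```

The lag with the highest correlation score is taken as the delay estimate. A practical estimator would typically work on short frames and repeat the search over time, since the delay may vary.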

In circuit switched networks, this delay can be constant. In packet switched networks (e.g. VOIP, VOLTE) the delay can be time variant and may be estimated continuously.

The estimation module can obtain information from a jitter buffer module that manages dynamic delays to avoid packet losses.

The jitter buffer mentioned above may be in the communication processor, in the Rx path, as an interface to the network. It is used to synchronize the order of packets received from the network. The jitter buffer may be used to apply dynamic buffering to remove network delay variations.
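The reordering role of a jitter buffer can be illustrated with the sketch below. It is a simplification under stated assumptions: the class name JitterBuffer, the fixed depth parameter and the packet layout (sequence number plus payload) are hypothetical, and the dynamic buffering described above is not modelled.

```python
import heapq


class JitterBuffer:
    """Fixed-depth reordering buffer (real jitter buffers adapt their depth)."""

    def __init__(self, depth):
        self.depth = depth  # number of packets held back before playout
        self._heap = []     # min-heap ordered by sequence number

    def push(self, seq, payload):
        """Store an arriving packet and return any packets ready for playout."""
        heapq.heappush(self._heap, (seq, payload))
        ready = []
        while len(self._heap) > self.depth:
            ready.append(heapq.heappop(self._heap))
        return ready

    def flush(self):
        """Release everything that is still buffered, in sequence order."""
        ready = []
        while self._heap:
            ready.append(heapq.heappop(self._heap))
        return ready


arrival_order = [0, 2, 1, 3, 5, 4]  # packets arrive out of order
buffer = JitterBuffer(depth=2)
played = []
for seq in arrival_order:
    played.extend(buffer.push(seq, f"frame-{seq}"))
played.extend(buffer.flush())
played_sequence = [seq for seq, _ in played]
print(played_sequence)
```

The number of packets held in such a buffer at a given moment is one kind of information that a delay estimation module could read from it.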

Each one of the AEC (acoustic echo cancellation) modules mentioned above or below may operate on the digitally sampled audio signals of the communication device. The transfer function of the acoustic environment from the loudspeaker to the microphone on the device is estimated in order to cancel the received echoes from the microphone signal. The AEC may be or may include an adaptive filter. An adaptive filter is used in voice echo cancellation to accommodate the time varying nature of the echo path. The filter learns the path when the far-end speaker is talking and the near-end speaker is silent and adjusts its coefficients (transfer function) according to the algorithm optimization criterion.
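The adaptive-filter idea can be sketched with a normalized least mean squares (NLMS) update. NLMS is chosen here only as a common example; the specification does not name a particular adaptation algorithm. The function name nlms_echo_cancel, the step size mu, the number of taps and the three-tap echo_path are illustrative assumptions.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptively estimated echo of `far_end` from `mic`."""
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate  # microphone signal after echo removal
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual, weights


# Far-end speech is modelled as noise; the near-end talker is silent.
rng = random.Random(1)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
echo_path = [0.6, 0.3, -0.1]  # hypothetical speaker-to-microphone response
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(far_end))
]

residual, weights = nlms_echo_cancel(far_end, mic)
tail_energy = sum(e * e for e in residual[-100:])
print(weights[:3], tail_energy)
```

In this noise-free setting the learned weights approach the assumed echo path and the residual decays toward zero. With a real loudspeaker and room the path is longer, time varying and partly non-linear, so some residual echo remains.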

Any AEC module may apply non-linear processing. Non-linear processing is the removal of residual echo left by the adaptive filter (echo cancellation). Residual echoes are the un-modeled components of the echo path. Most adaptive filters are linear and can only cancel the linear portions of the echo path. Thus the nonlinear portions cannot be removed via the adaptive filter, and a residual echo suppressor follows the filter to handle the nonlinear portions of the echo that remain.
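One simple form of residual echo suppression is a center clipper, which zeroes samples whose magnitude falls below a threshold. It is shown here only as an illustration; the specification does not state which non-linear processing is used, and the function name center_clip and the threshold value are assumptions.

```python
def center_clip(residual, threshold):
    """Zero out low-level samples, which are assumed to be residual echo."""
    return [0.0 if abs(e) < threshold else e for e in residual]


# Output of a linear echo canceller: small leftovers plus louder near-end speech.
after_adaptive_filter = [0.001, -0.5, 0.0004, 0.2, -0.003]
cleaned = center_clip(after_adaptive_filter, threshold=0.01)
print(cleaned)
```

A fixed threshold also removes quiet near-end speech, which is why practical suppressors usually adapt their attenuation to the estimated echo level.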

FIG. 2B illustrates various components (such as first microphone 41 and first speaker 31) and one or more compensation modules (such as delay unit Drx 62 and Drx estimation module 61) for performing compensation operations related to signals of FIG. 2A according to an embodiment of the invention.

Drx estimation module 61 may estimate Drx using, for example, correlation between the converted third user second cross echo 1052′ and the converted third user first device echo (denoted 1041′ in FIG. 4B).

Additionally or alternatively, the correlation between signals 1031 and 1131 in the first device can be used to estimate the delay.

Delay unit Drx 62 may delay the converted third user first device echo 1041′ before it is fed to first speaker in order to generate the third user first speaker acoustic signal 1031 that is Drx compensated.

The Drx compensation may be performed by the first device, by the second device or by both first and second devices.
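A delay unit of the kind described above can be sketched as a sample-by-sample delay line. The class name DelayUnit and the choice to express the delay as a whole number of samples are assumptions; a real device would more likely process frames and might need fractional delays.

```python
from collections import deque


class DelayUnit:
    """Delay a sample stream by a fixed number of samples."""

    def __init__(self, delay_samples):
        # Pre-fill with silence so the first outputs are zeros.
        self._buf = deque([0.0] * delay_samples)

    def process(self, sample):
        """Accept one input sample and return the correspondingly delayed one."""
        self._buf.append(sample)
        return self._buf.popleft()


unit = DelayUnit(delay_samples=3)
delayed = [unit.process(x) for x in [1.0, 2.0, 3.0, 4.0, 5.0]]
print(delayed)
```

The same structure can sit in the receive path before the speaker or in the transmit path before the signal leaves the device. Only the estimate that sets delay_samples would differ.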

FIG. 3A illustrates a first person that talks during a conference call, a mixer, three devices and various signals (1011, 1012, 1111, 1112, 1310 and 1410) generated during the conference call according to an embodiment of the invention.

There may be a delay difference of Dtx between signals 1111 and 1112. Without Dtx compensation the mixer 25 may mix time shifted signals and the third person will eventually receive a mixture of time shifted signals. The perceived speech quality and intelligibility decrease due to Dtx.

Dtx should be estimated and compensated in at least one of the first and second devices.
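The effect of Dtx on the mixed signal can be illustrated with a toy mixer that adds its two inputs sample by sample. The helper names mix and delay, the short test signal and the two-sample value of dtx are assumptions chosen for illustration.

```python
def mix(a, b):
    """Toy mixer: add two equally long signals sample by sample."""
    return [x + y for x, y in zip(a, b)]


def delay(signal, samples):
    """Delay a signal by a whole number of samples, keeping its length."""
    return [0.0] * samples + signal[:len(signal) - samples]


source = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0, 0.0, 0.0]
dtx = 2  # hypothetical: the second device's signal reaches the mixer 2 samples late

first_device = source
second_device = delay(source, dtx)

uncompensated = mix(first_device, second_device)
compensated = mix(delay(first_device, dtx), second_device)
print(uncompensated)
print(compensated)
```

Without compensation the mix contains two shifted copies of the same speech. After the first device delays its output by the same amount, the copies line up and the mix is a scaled version of a single copy.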

The first and second device may not be aware of Dtx but may estimate Dtxbased on Drx.

The third device may can apply de-reverberation techniques to solve theproblem.

The third device may estimate Dtx by analyzing signal 1310 and may sendfeedback to the first and second devices.

FIG. 3B illustrates various components (such as first microphone 41 andfirst speaker 31) and one or more compensation modules (such as delayunit Dtx 64 and Dtx estimation module 63) for performing compensationoperations related to signals of FIG. 3A according to an embodiment ofthe invention.

Dtx estimation module 63 may estimate Trx using, for example correlationbetween the following signals—the converted third user second cross echo1052′ and the converted third user first digital input signal 1131′.

Delay unit Dtx 64 may delay a converted first user first microphone acoustic signal (converted by the microphone from first user first microphone acoustic signal 1011) before it is fed to the mixer in order to generate a first user first device digital signal that is Dtx compensated.
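A delay unit of this kind can be pictured as a simple first-in first-out buffer. The following is a minimal sketch under stated assumptions: the class name `DelayUnit`, the helper `mix`, the two-sample delay and the sample values are all hypothetical, and the sample-wise sum only stands in for the mixer.

```python
from collections import deque


class DelayUnit:
    """Delays a sample stream by a fixed number of samples (zero-filled at start)."""

    def __init__(self, delay_samples):
        if delay_samples < 0:
            raise ValueError("delay must be non-negative")
        # Pre-fill with zeros so the first `delay_samples` outputs are silence.
        self._buffer = deque([0.0] * delay_samples)

    def process(self, sample):
        self._buffer.append(sample)
        return self._buffer.popleft()


def mix(a, b):
    """Sample-wise sum, standing in for the mixer."""
    return [x + y for x, y in zip(a, b)]


# Hypothetical scenario: the same speech reaches the mixer from the second
# device 2 samples later than from the first device (delay difference of 2).
speech = [1.0, 2.0, 3.0, 0.0, 0.0]
from_second_device = [0.0, 0.0] + speech[:-2]

delay_unit = DelayUnit(delay_samples=2)
from_first_device = [delay_unit.process(s) for s in speech]

mixed = mix(from_first_device, from_second_device)
print(mixed)
```

With the delay applied in the first device, the two contributions line up at the mixer instead of being mixed as time-shifted copies.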

The Dtx compensation may be performed by the first device, by the second device or by both first and second devices.

FIG. 4A illustrates a third person 13 that talks during a conference call, a mixer 25, three devices 21, 22 and 23 and various signals (1430, 1330, 1131, 1132, 1041 and 1042) generated during the conference call according to an embodiment of the invention.

In a hands-free mode there may be an echo due to the acoustic coupling between the microphone and the speaker of each device.

FIG. 4B illustrates various components (such as first microphone 41 and first speaker 31) and one or more compensation modules (first echo cancellation unit (AEC1) 65) for performing compensation operations related to some of the signals of FIG. 4A according to an embodiment of the invention.

AEC1 65 may apply one or more linear and/or non-linear echo cancellation processes. Additionally or alternatively, AEC1 may apply linear and/or non-linear echo suppression processes. Non-linear echo components may result from non-linearity of the speaker transfer function and/or from vibrations of the device.

The third user first device echo 1041 is received by first microphone and is converted to the converted third user first device echo 1041′. The third user first device echo 1041 is generated by first speaker 31 in response to third user first device digital input signal 1131. AEC1 performs echo cancellation after being fed by third user first device digital input signal 1131 and converted third user first device echo 1041′. The output of AEC1 65 is sent from first I/O port 51 to mixer 25.
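The specification does not name a particular linear echo cancellation process. As one commonly used possibility, the sketch below uses a normalized least mean squares (NLMS) adaptive filter, with the speaker feed as the reference and the microphone signal as the input to be cleaned. The function name, the number of taps, the step size and the simulated coupling `echo_path` are assumptions made for illustration.

```python
import random


def nlms_echo_canceller(reference, microphone, taps=4, step=0.5, eps=1e-8):
    """Return the microphone signal with the estimated echo of `reference` removed."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    output = []
    for ref_sample, mic_sample in zip(reference, microphone):
        history = [ref_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic_sample - echo_estimate
        # Normalize the update by the reference power (eps avoids division by zero).
        power = sum(x * x for x in history) + eps
        weights = [w + step * error * x / power for w, x in zip(weights, history)]
        output.append(error)
    return output


def energy(samples):
    return sum(v * v for v in samples)


# Hypothetical simulation: a noise-like far-end signal and a short
# speaker-to-microphone coupling with no near-end speech.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(500)]
echo_path = [0.0, 0.6, 0.3]
mic = [
    sum(echo_path[k] * far_end[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(far_end))
]

residual = nlms_echo_canceller(far_end, mic)
print(energy(residual[-100:]) < 0.01 * energy(mic[-100:]))
```

This sketch covers only the linear part; non-linear echo components and echo suppression would need additional processing.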

FIG. 5A illustrates a third person 13 that talks during a conference call, a mixer 25, three devices 21, 22 and 23 and various signals (1430, 1330, 1132, 1131, 1051 and 1052) generated during the conference call according to an embodiment of the invention.

A cross acoustic echo problem exists due to acoustic coupling between the speaker of one device and a microphone of another device. This is especially true when the first and second devices are expected to be in the same room and relatively proximate to each other (for example, less than 10 meters apart), as the first and second persons need to use these devices during the conference call and may also need to hear each other without the aid of these devices.

The cross echo may be cancelled using at least one linear and/or non-linear echo cancellation process and at least one linear and/or non-linear echo suppression process.

In addition, Drx should be compensated in the Rx path.

The non-linear cross echo is due to speaker non-linearity and non-linear differences due to transmission links (between the first and second devices, the mixer and the third device) that may include vocoders and time-variant network delays.

Direct echo cancellation (AEC₁) and cross echo cancellation (AEC₂) can be done jointly using the same module when the delay is compensated correctly.

For example, assume that signal 1131 and signal 1132 are the same but have a delay difference because of different network delays. First microphone 41 receives both the echo of signal 1131 and an echo of signal 1132. Since signal 1131 and signal 1132 are the same with different delays, a joint AEC can cancel the different couplings jointly, using signal 1131 as the reference to the AEC.
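The reasoning above can be checked numerically: if the second signal is only a delayed copy of the first, the direct coupling and the delayed cross coupling add up to a single combined impulse response driven by one reference. The sketch below is illustrative only; the coupling values, the three-sample delay and the helper names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Plain discrete convolution."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += s * h
    return out


def delayed(samples, delay):
    """Prepend `delay` zeros, i.e. delay the sequence by `delay` samples."""
    return [0.0] * delay + list(samples)


def add_padded(a, b):
    """Add two sequences after zero-padding the shorter one."""
    length = max(len(a), len(b))
    a = list(a) + [0.0] * (length - len(a))
    b = list(b) + [0.0] * (length - len(b))
    return [x + y for x, y in zip(a, b)]


# Hypothetical values: one reference signal, a direct coupling (own speaker to
# own microphone) and a cross coupling (other device's speaker to own microphone).
reference = [1.0, -2.0, 4.0]
direct_coupling = [0.5, 0.25]
cross_coupling = [0.25]
delay_difference = 3  # the other device plays the same signal 3 samples later

direct_echo = convolve(reference, direct_coupling)
cross_echo = convolve(delayed(reference, delay_difference), cross_coupling)
microphone = add_padded(direct_echo, cross_echo)

# A single filter fed with the one reference reproduces both echoes.
joint_response = add_padded(direct_coupling, delayed(cross_coupling, delay_difference))
print(convolve(reference, joint_response) == microphone)
```

The practical consequence is that the joint filter must be long enough to span the delay difference, which is one reason the delay compensation matters.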

FIG. 5B illustrates various components (such as first microphone 41 and first speaker 31) and one or more compensation modules (such as second echo cancellation unit (AEC2) 66, delay unit Drx 62 and Drx estimation module 61) for performing compensation operations related to some of the signals of FIG. 5A according to an embodiment of the invention.

The first microphone 41 may sense third user second cross echo 1052 and output the converted third user second cross echo 1052′. The converted third user second cross echo 1052′ may be fed to AEC2 66.

FIG. 6A illustrates a first person that talks during a conference call, a mixer, three devices and various signals generated during the conference call according to an embodiment of the invention.

When the first person (or the second person) talks, the speech (acoustic signal) is captured by the first and second microphones of the first and second devices (signals 1011 and 1012). This causes signals 1111 and 1112 to be outputted from the first and second devices and sent to the mixer, and causes signals 1141 and 1142 to reach the second and first devices respectively. These signals cause the first and second speakers to output signals 1013 and 1014.

The first and second persons perceive the acoustic leakage signals as unwanted delayed echo, which can make participation in the conference call impossible or intolerable.

The acoustic leakage signals may be cancelled by AEC3 using at least one linear and/or non-linear echo cancellation process and at least one linear and/or non-linear echo suppression process.

AEC3 is the same module as AEC1. (Use of AEC3 is the most innovative part of the patent.)

AEC3 may perform the echo cancellation and/or suppression using the correlation between the original speech (signals 1011 and 1012) and the output acoustic leakage signals 1013 and 1014.

AEC3 can be implemented as a correlator that acts as a feedback detector, together with an attenuator that is applied whenever feedback exists.
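A correlator-plus-attenuator arrangement of this kind might look like the following sketch. It is not taken from the specification: the function names, the 0.8 threshold, the frame length and the choice to mute completely (gain of zero) are assumptions. The first argument stands for a delayed frame of the device's own captured speech and the second for the frame received from the mixer.

```python
import math


def normalized_correlation(a, b):
    """Correlation coefficient of two equal-length frames, in the range [-1, 1]."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0


def attenuate_if_feedback(own_delayed, incoming, threshold=0.8, gain=0.0):
    """Return the frame to send to the speaker.

    If the incoming frame is strongly correlated with the device's own
    (delayed) speech, it is treated as leakage and attenuated by `gain`;
    otherwise it is passed through unchanged.
    """
    if abs(normalized_correlation(own_delayed, incoming)) >= threshold:
        return [gain * s for s in incoming]
    return list(incoming)


# Hypothetical frames: local speech, a scaled copy of it coming back through
# the mixer, and an unrelated frame from a far-end talker.
own = [1.0, -2.0, 3.0, -1.0]
leak = [0.5 * s for s in own]
far = [1.0, 1.0, -1.0, -1.0]

print(max(abs(s) for s in attenuate_if_feedback(own, leak)))
print(attenuate_if_feedback(own, far) == far)
```

The detector relies on the two frames being time-aligned, which is why an estimate of the network delay is useful to such a module.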

AEC3 may benefit from receiving an estimate of Dn. Dn may be estimated using cross correlation or any other method and needs to be compensated.

Dn is the time for the transmission of signals 1012, 1112 and 1142 using the fact that signal 1011 equals signal 1012. It is estimated by correlating converted versions of signals 1011 and 1142.

FIG. 6B illustrates various components (such as first microphone 41 and first speaker 31) and one or more compensation modules (third echo cancellation unit (AEC3) 67, delay unit Dn 69 and Dn estimation module 68) for performing compensation operations related to some of the signals of FIG. 6A according to an embodiment of the invention.

The output of first microphone 41 is fed to delay unit Dn that sends a delayed signal (delayed by Dn) to AEC3 67. AEC3 also receives signal 1111 and performs echo cancellation and/or echo suppression to provide an output signal to the speaker.

Dn estimation module 68 may estimate Dn by correlating signals 1111 and 1142.

AEC2 may be the same as AEC1 and the same as AEC3. Regarding AEC3: AEC3 manipulates the input signal to the speaker and can even mute the speaker, because the first and second users may hear each other without the assistance of their devices, and thus can improve the echo cancellation in a substantial manner.

FIG. 7A illustrates a first person that talks during a conference call, a mixer, three devices and various signals (1011, 1012, 1111, 1112, 1141, 1141, 1310 and 1410) that are generated during the conference call according to an embodiment of the invention.

When the first person talks and the feedback is not cancelled by AEC₃, or some residual feedback exists, an echo (such as signals 1043 and 1044) from a speaker is received by the microphone of the same device. The content of the residual feedback depends on whether there is an AEC3 and, if there is, on the output of AEC3.

The echo may be cancelled and/or suppressed by first echo cancellation unit (AEC1) 65 that may apply one or more linear and/or non-linear echo cancellation processes. Additionally or alternatively, AEC1 may apply linear and/or non-linear echo suppression processes.

It should be noted that successful implementation of AEC3 may avoid the generation of the acoustic leakage signals.

FIG. 7B illustrates various components (such as first microphone 41 and first speaker 31) and one or more compensation modules for performing compensation operations related to some of the signals of FIG. 7A according to an embodiment of the invention.

The first user first device echo 1043 is received by first microphone and is converted to a converted first user first device echo that is fed to AEC1 65. AEC1 also receives first user second device digital leakage signal 1142.

When the first person talks, signals 1031 and 1032 are received by the microphones of the first and second devices.

Digital leakage signals 1141 and 1142 may be used by the AEC1 of the first and second devices as echo references.

If the digital leakage signals 1141 and 1142 are not cancelled with any AEC3 then the desired signal (signals 1011 and 1012) may be suppressed by the AEC1 of the first and second devices due to the existence of non-zero value echo references, which are the digital leakage signals.

In AEC1, the output of AEC3, which gets 1142 as input, is used as the reference signal. So when AEC3 does not cancel the signal completely, AEC1 will try to cancel the echo of the output of AEC3. On the other hand, the first person is still talking and generating signal 1011 as input to AEC1.

This generates a double talk situation, which always exists when the first person is talking, in which AEC1 will perform echo suppression and suppress the desired signal of the first person.

AEC1 should apply double talk detection that can also detect a feedback situation in order not to clip the desired speech of the first person.
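The specification does not say how double talk is detected. One classical option, shown here purely as an illustration, is the Geigel detector, which flags double talk when the microphone level exceeds a fraction of the recent peak of the echo reference. The window length, the 0.5 threshold and the signal values are assumptions, and this simple detector does not by itself identify a feedback situation.

```python
def geigel_double_talk(reference, microphone, window=4, threshold=0.5):
    """Return one boolean per sample: True where double talk is declared.

    Double talk is declared when |microphone[n]| is larger than `threshold`
    times the largest |reference| value over the last `window` samples.
    """
    if len(reference) < len(microphone):
        raise ValueError("reference must be at least as long as microphone")
    flags = []
    for n, mic_sample in enumerate(microphone):
        start = max(0, n - window + 1)
        recent_peak = max(abs(x) for x in reference[start:n + 1])
        flags.append(abs(mic_sample) > threshold * recent_peak)
    return flags


# Hypothetical signals: the echo is the reference attenuated to a quarter and
# delayed by one sample; the local talker is active on samples 3 and 4 only.
reference = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
echo = [0.0] + [0.25 * x for x in reference[:-1]]
near_end = [0.0, 0.0, 0.0, 0.75, 1.0, 0.0]
microphone = [e + s for e, s in zip(echo, near_end)]

flags = geigel_double_talk(reference, microphone)
print(flags)
```

An echo canceller could use such flags to freeze adaptation or to relax suppression while the local talker is active, so that the desired speech is not clipped.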

This unwanted suppression of signals 1011 and 1012 may result in reception, by the third person, of a clipped and unintelligible speech signal 1410.

When AEC3 is used, the feedback will be cancelled and a feedback-free signal will be used as reference to avoid clipping of the speech of the first person. AEC1 may include an additional control mechanism to avoid this clipping.

FIG. 7C illustrates various components (such as first microphone 41 and first speaker 31) and one or more compensation modules for performing compensation operations related to some of the signals of FIG. 7A according to an embodiment of the invention.

The output of first microphone 41 is fed to AEC1 65. AEC1 65 is also fed with the output of AEC3 67. AEC1 65 outputs first user first device digital device output signal 1111.

AEC3 67 receives the first user second device digital leakage signal 1142 and receives (from delay unit Dn 69) a Dn-delayed first user first device digital device output signal 1111.

Delay unit Dn receives the estimate of Dn from Dn estimation module 68.

FIG. 8A illustrates a first person that talks during a conference call, a mixer, three devices and various signals (1011, 1012, 1053, 1054, 1111, 1112, 1141, 1142, 1310 and 1410) that are generated during the conference call according to an embodiment of the invention.

When the first person talks and the feedback is not cancelled by the AEC3 of the other device, or some residual feedback still exists, the feedback is played by the speaker of one device out of the first and second devices and is received as a cross echo by a microphone of the other device out of the first and second devices. A first feedback path includes first microphone, first device to mixer, mixer to second device and second speaker; see the various signals (1011, 1111, 1141 and 1054) that propagate over the first feedback path. A second feedback path includes second microphone, second device to mixer, mixer to first device and first speaker; see the various signals (1012, 1112, 1142 and 1053) that propagate over the second feedback path.

These feedback paths result in a howling effect.

A cross echo (out of signals 1053 and 1054) may exist if the AEC3 of the device that outputted that cross echo did not exist or did not cancel the feedback completely.

Successful implementation of AEC3 in each of the first and second devices may avoid the generation of the cross echoes and thus avoid feedback and howling.

The howling may be cancelled by using a linear and/or non-linear acoustic feedback canceller.

One or more feedback paths convey signals 1011, 1111, 1141 and 1054. Feedback cancellation is done based on signals 1011 and 1054. Signal 1054 is a delayed version of signal 1011.

Additionally or alternatively, the echo cancelling may be executed on signal 1111 (that results from signal 1011) and a delayed version of signal 1111 (that results from signal 1054).

Non-linear feedback is due to speaker non-linearity, vocoder and time-variant delay.

The network delay needs to be estimated using cross correlation or any other method.

An alternative method is a special linear and non-linear echo cancellation (AEC4) since the digital leakage signals 1142 and 1141 are correlated to the cross echoes 1053 and 1054.

FIG. 8B illustrates various components (such as first microphone 41 and first speaker 31) and one or more compensation modules (such as fourth echo cancellation unit (AEC4 70)) for performing compensation operations related to some of the signals of FIG. 8A according to an embodiment of the invention.

AEC4 70 is fed by the converted signal from first microphone (a converted first user second device cross echo) and by a first user second device digital leakage signal 1142, and performs echo cancellation and/or echo suppression to output first user first device digital output signal 1111.

FIG. 8C illustrates various components (such as first microphone 41 and first speaker 31) and one or more compensation modules (AFC 71, delay unit Dn 69) for performing compensation operations related to some of the signals of FIG. 8A according to an embodiment of the invention.

AFC 71 is fed by the converted signal from first microphone (a converted first user second device cross echo) and by a delayed (by Dn) first user first device digital output signal 1111.

FIG. 9A illustrates various components (such as first microphone 41 and first speaker 31) and one or more compensation modules (AEC1 65, AEC2 66, AFC 71, AEC3 67, Delay unit Dn 69, Delay unit Dtx 64, Delay unit Drx 62, Dn estimation module 68, Dtx estimation module 63 and Drx estimation module 61) of first device 21 according to an embodiment of the invention.

The input of delay unit Drx 62 is fed by first device input digital signals such as signal 1131. Delay unit Drx 62 performs Drx compensation. The duration of Drx can be learnt by Drx estimation module 61 that is coupled to Delay unit Drx 62.

The output of Delay unit Drx 62 is coupled to a first input of AEC3 67. AEC3 67 is also fed by the output of Delay unit Dn 69.

The output of AEC3 67 is fed to first speaker 31.

Delay unit Dn 69 compensates for network delays and is fed by Dn estimation module 68 (that estimates Dn) and by an output of AFC 71.

The output of first microphone 41 is fed to a sequence of three compensation modules that includes AEC1 65, AEC2 66 and AFC 71.

The output of AFC 71 is also fed to Delay unit Dtx 64. Delay unit Dtx 64 performs Dtx compensation and outputs a Dtx compensated signal to first I/O port 51. Delay unit Dtx 64 is also fed by Dtx estimation module 63. Dtx estimation module 63 estimates Dtx.

FIG. 9B illustrates various components (such as first microphone 41 and first speaker 31) and one or more compensation modules of a first device of a first person according to an embodiment of the invention.

FIG. 9B differs from FIG. 9A by including a combined first and second echo cancellation unit AEC12 72 instead of the pair of serially coupled AEC1 65 and AEC2 66.

FIG. 9C illustrates various components (such as first microphone 41 and first speaker 31) and one or more compensation modules of a first device of a first person according to an embodiment of the invention.

FIG. 9C differs from FIG. 9B by not including AFC 71. The output of the combined first and second echo cancellation unit AEC12 72 is fed to Delay unit Dn 69 and Delay unit Dtx 64.

FIG. 9D illustrates various components (such as first microphone 41 and first speaker 31) and one or more compensation modules (61, 62, 63, 64, 68, 69, 71 and 72) of a first device of a first person according to an embodiment of the invention.

FIG. 10A illustrates a third person that talks during a conference call, a near end network, a connecting network, three devices and various signals (1330, 1335, 1031, 1032, 1501 and 1502) generated during the conference call according to an embodiment of the invention.

FIG. 10B illustrates a person that talks during a conference call, a near end network, a connecting network, three devices and various signals (1011, 1012, 1512, 1511, 1336 and 1337) generated during the conference call according to an embodiment of the invention.

FIG. 11A illustrates a third person 13 and a first person 11 that talk during a conference call, a mixer 25, three devices 21, 22 and 23 that include a second device 22 that has a turned off (OFF) microphone, and various signals (1011, 1031, 1032, 1111, 1131, 1132, 1310, 1330, 1410 and 1430) generated during the conference call according to an embodiment of the invention.

FIG. 11B illustrates a third person 13 and a first person 11 that talk during a conference call, a mixer 25, three devices 21, 22 and 23 that include a second device 22 that has a turned off (OFF) speaker, and various signals (1011, 1012, 1031, 1111, 1112, 1131, 1132, 1310, 1330, 1410 and 1430) generated during the conference call according to an embodiment of the invention.

FIG. 11C illustrates a third person and a first person that talk during a conference call, a mixer, three devices that include a second device that has a turned off (OFF) microphone and a first device with a turned off (OFF) speaker, and various signals (1011, 1012, 1032, 1111, 1131, 1132, 1310, 1330, 1410 and 1430) generated during the conference call according to an embodiment of the invention.

FIG. 11D illustrates a third person and a first person that talk during a conference call, a mixer, three devices that include a second device that has a turned off (OFF) microphone, a third device with a turned off (OFF) microphone, and a first device with a turned off (OFF) speaker, as well as various signals (1011, 1012, 1032, 1111, 1131, 1141, 1310, 1410 and 1430) generated during the conference call according to an embodiment of the invention.

FIG. 12A illustrates a second person that talks during a conference call, a mixer, three devices, and various signals (1330, 1601, 1602, 1603, 1604, 1605, 1606, 1607 and 1608) generated during the conference call according to an embodiment of the invention.

Any reference to the term “comprising” or “having” should be interpreted also as referring to “consisting of” or “consisting essentially of”. For example, a method that comprises certain steps can include additional steps, can be limited to the certain steps, or may include additional steps that do not materially affect the basic and novel characteristics of the method, respectively.

The invention may also be implemented in a computer program for running on a computer system, at least including code portions for performing steps of a method according to the invention when run on a programmable apparatus, such as a computer system or enabling a programmable apparatus to perform functions of a device or system according to the invention. The computer program may cause the storage system to allocate disk drives to disk drive groups.

A computer program is a list of instructions such as a particular application program and/or an operating system. The computer program may for instance include one or more of: a subroutine, a function, a procedure, an object method, an object implementation, an executable application, an applet, a servlet, a source code, an object code, a shared library/dynamic load library and/or other sequence of instructions designed for execution on a computer system.

The computer program may be stored internally on a non-transitory computer readable medium. All or some of the computer program may be provided on computer readable media permanently, removably or remotely coupled to an information processing system. The computer readable media may include, for example and without limitation, any number of the following: magnetic storage media including disk and tape storage media; optical storage media such as compact disk media (e.g., CD-ROM, CD-R, etc.) and digital video disk storage media; nonvolatile memory storage media including semiconductor-based memory units such as FLASH memory, EEPROM, EPROM, ROM; ferromagnetic digital memories; MRAM; volatile storage media including registers, buffers or caches, main memory, RAM, etc. A computer process typically includes an executing (running) program or portion of a program, current program values and state information, and the resources used by the operating system to manage the execution of the process. An operating system (OS) is the software that manages the sharing of the resources of a computer and provides programmers with an interface used to access those resources. An operating system processes system data and user input, and responds by allocating and managing tasks and internal system resources as a service to users and programs of the system. The computer system may for instance include at least one processing unit, associated memory and a number of input/output (I/O) devices. When executing the computer program, the computer system processes information according to the computer program and produces resultant output information via I/O devices.

In the foregoing specification, the invention has been described with reference to specific examples of embodiments of the invention. It will, however, be evident that various modifications and changes may be made therein without departing from the broader spirit and scope of the invention as set forth in the appended claims.

Moreover, the terms “front,” “back,” “top,” “bottom,” “over,” “under” and the like in the description and in the claims, if any, are used for descriptive purposes and not necessarily for describing permanent relative positions. It is understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in other orientations than those illustrated or otherwise described herein.

Those skilled in the art will recognize that the boundaries between logic blocks are merely illustrative and that alternative embodiments may merge logic blocks or circuit elements or impose an alternate decomposition of functionality upon various logic blocks or circuit elements. Thus, it is to be understood that the architectures depicted herein are merely exemplary, and that in fact many other architectures may be implemented which achieve the same functionality.

Any arrangement of components to achieve the same functionality is effectively “associated” such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality may be seen as “associated with” each other such that the desired functionality is achieved, irrespective of architectures or intermedial components. Likewise, any two components so associated can also be viewed as being “operably connected,” or “operably coupled,” to each other to achieve the desired functionality.

Furthermore, those skilled in the art will recognize that boundaries between the above described operations are merely illustrative. The multiple operations may be combined into a single operation, a single operation may be distributed in additional operations and operations may be executed at least partially overlapping in time. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments.

Also for example, in one embodiment, the illustrated examples may be implemented as circuitry located on a single integrated circuit or within a same device. Alternatively, the examples may be implemented as any number of separate integrated circuits or separate devices interconnected with each other in a suitable manner.

Also for example, the examples, or portions thereof, may be implemented as software or code representations of physical circuitry or of logical representations convertible into physical circuitry, such as in a hardware description language of any appropriate type.

Also, the invention is not limited to physical devices or units implemented in non-programmable hardware but can also be applied in programmable devices or units able to perform the desired device functions by operating in accordance with suitable program code, such as mainframes, minicomputers, servers, workstations, personal computers, notepads, personal digital assistants, electronic games, automotive and other embedded systems, cell phones and various other wireless devices, commonly denoted in this application as ‘computer systems’.

However, other modifications, variations and alternatives are also possible. The specifications and drawings are, accordingly, to be regarded in an illustrative rather than in a restrictive sense.

In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word ‘comprising’ does not exclude the presence of other elements or steps than those listed in a claim. Furthermore, the terms “a” or “an,” as used herein, are defined as one or more than one. Also, the use of introductory phrases such as “at least one” and “one or more” in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an.” The same holds true for the use of definite articles. Unless stated otherwise, terms such as “first” and “second” are used to arbitrarily distinguish between the elements such terms describe. Thus, these terms are not necessarily intended to indicate temporal or other prioritization of such elements. The mere fact that certain measures are recited in mutually different claims does not indicate that a combination of these measures cannot be used to advantage.

While certain features of the invention have been illustrated and described herein, many modifications, substitutions, changes, and equivalents will now occur to those of ordinary skill in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the invention.

What is claimed is:
 1. A mobile device, comprising: a microphone configured to detect sounds from an environment during a conference call and to convert the detected sounds to a first digital signal; a communications port configured to transmit the first digital signal to at least a far-end device participating in the conference call, the far-end device being located in a different environment than the mobile device; and an echo cancellation unit configured to: receive a second digital signal from a near-end device participating in the conference call, the second digital signal including sounds detected from the environment by the near-end device; and cancel the second digital signal based at least in part on the first digital signal.
 2. The mobile device of claim 1, further comprising a speaker configured to output second sounds based on the second digital signal.
 3. The mobile device of claim 2, wherein the echo cancellation unit is configured to cancel the second digital signal based at least in part on a correlation between the sounds and the second sounds.
 4. The mobile device of claim 2, wherein the echo cancellation unit is configured to mute the speaker.
 5. The mobile device of claim 2, wherein the echo cancellation unit is configured to manipulate a signal that is provided to the speaker.
 6. The mobile device of claim 1, further comprising a delay unit configured to delay the first digital signal by a delay to provide a delayed first digital signal.
 7. The mobile device of claim 6, wherein the echo cancellation unit is configured to cancel the second digital signal based at least in part on the delayed first digital signal.
 8. The mobile device of claim 6, further comprising a delay estimation unit configured to estimate the delay based on a correlation between the first digital signal and the second digital signal.
 9. The mobile device of claim 1, wherein the sounds detected from the environment by the near-end device include at least some of the sounds detected from the environment by the microphone.
 10. A method for participating in a conference call, comprising: at a mobile device: detecting, via a microphone, sounds from an environment during a conference call; converting the detected sounds to a first digital signal; transmitting the first digital signal to at least a far-end device participating in the conference call, the far-end device being located in a different environment than the mobile device; receiving a second digital signal from a near-end device participating in the conference call, the second digital signal including sounds detected from the environment by the near-end device; and cancelling the second digital signal based at least in part on the first digital signal.
 11. The method of claim 10, further comprising outputting, via a speaker, second sounds based on the second digital signal.
 12. The method of claim 11, wherein cancelling the second digital signal comprises cancelling the second digital signal based at least in part on a correlation between the sounds and the second sounds.
 13. The method of claim 11, further comprising muting the speaker.
 14. The method of claim 11, further comprising manipulating a signal that is provided to the speaker.
 15. The method of claim 10, further comprising delaying the first digital signal by a delay to provide a delayed first digital signal.
 16. The method of claim 15, wherein cancelling the second digital signal comprises cancelling the second digital signal based at least in part on the delayed first digital signal.
 17. The method of claim 15, further comprising estimating the delay based on a correlation between the first digital signal and the second digital signal.
 18. The method of claim 10, wherein the sounds detected from the environment by the near-end device include at least some of the sounds detected from the environment via the microphone.
 19. A controller, comprising: a processing system; and a memory storing instructions that, when executed by the processing system, causes the controller to: detect, via a microphone, sounds from an environment during a conference call; convert the detected sounds to a first digital signal; transmit the first digital signal to at least a far-end device participating in the conference call, the far-end device being located in a different environment than the mobile device; receive a second digital signal from a near-end device participating in the conference call, the second digital signal including sounds detected from the environment by the near-end device; and cancel the second digital signal based at least in part on the first digital signal.
 20. The controller of claim 19, wherein the sounds detected from the environment by the near-end device include at least some of the sounds detected from the environment by the microphone.